# INT4 quantization
Gemma 3 27b It Quantized.w4a16
This is a quantized version of google/gemma-3-27b-it, supporting image-and-text input and text output. Its weights are quantized to INT4 while activations stay in 16-bit precision (the w4a16 scheme), which enables efficient inference with vLLM.
Image-to-Text
Transformers

RedHatAI
302
1
Qwen3 30B A3B Quantized.w4a16
Apache-2.0
INT4 weight-quantized version of Qwen3-30B-A3B, reducing disk and GPU memory requirements by about 75% while maintaining high performance.
Large Language Model
Transformers

RedHatAI
379
2
Qwen3 32B Quantized.w4a16
Apache-2.0
INT4 quantized version of Qwen3-32B, reducing disk and GPU memory requirements by about 75% through weight quantization while maintaining high performance.
Large Language Model
Transformers

RedHatAI
2,213
5
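The 75% figure quoted in these descriptions follows from simple arithmetic: a 4-bit weight takes a quarter of the space of a 16-bit weight. The sketch below illustrates that calculation. The function names and the round 32-billion parameter count are assumptions made for illustration, and the estimate covers weights only; real checkpoints also store quantization scales and usually keep some layers unquantized, so the actual saving is somewhat smaller.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


def reduction_percent(bits_before: float, bits_after: float) -> float:
    """Percentage of weight memory saved when lowering the bits per weight."""
    return (1 - bits_after / bits_before) * 100


# Assumed round parameter count for a "32B" model; the real count differs slightly.
params = 32e9
mem_16bit = weight_memory_gib(params, 16)
mem_4bit = weight_memory_gib(params, 4)

print(f"16-bit weights: {mem_16bit:.1f} GiB")
print(f"4-bit weights:  {mem_4bit:.1f} GiB")
print(f"16-bit -> 4-bit reduction: {reduction_percent(16, 4):.0f}%")
print(f" 8-bit -> 4-bit reduction: {reduction_percent(8, 4):.0f}%")
```

The second ratio shows why a model whose original weights are already 8-bit would see a saving closer to 50% than 75% when quantized to INT4.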
Deepseek R1 Quantized.w4a16
MIT
INT4 weight-quantized version of DeepSeek-R1, reducing GPU memory and disk space requirements by approximately 50% while maintaining original model performance.
Large Language Model
RedHatAI
119
4
Gemma 3 12b It GPTQ 4b 128g
This model is an INT4 quantized version of google/gemma-3-12b-it, using the GPTQ algorithm to reduce weight precision from 16-bit to 4-bit, significantly decreasing disk space and GPU memory requirements.
Image-to-Text
Transformers

ISTA-DASLab
1,175
2
Whisper Large V3 Turbo Quantized.w4a16
Apache-2.0
An INT4 weight-quantized version of openai/whisper-large-v3-turbo, supporting efficient speech-to-text tasks.
Speech Recognition
Transformers, English

RedHatAI
1,851
2
Mistral Small 3.1 24B Instruct 2503 GPTQ 4b 128g
Apache-2.0
This model is an INT4 quantized version of Mistral-Small-3.1-24B-Instruct-2503, using the GPTQ algorithm to reduce weights from 16-bit to 4-bit, significantly decreasing disk space and GPU memory requirements.
Large Language Model
ISTA-DASLab
21.89k
13
Gemma 3 27b It GPTQ 4b 128g
This model is an INT4 quantized version of gemma-3-27b-it, reducing disk and GPU memory requirements by decreasing the number of bits per parameter.
Image-to-Text
Transformers

ISTA-DASLab
32.15k
25
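The "4b 128g" suffix on the GPTQ models is commonly read as 4-bit weights with one scale shared per group of 128 weights. The sketch below illustrates group-wise INT4 storage under that assumption, using plain round-to-nearest quantization. It is not GPTQ itself, which additionally adjusts the remaining weights to compensate for rounding error. All function names are hypothetical, and the symmetric [-7, 7] range is one possible choice.

```python
import math
from typing import List, Tuple

GROUP_SIZE = 128  # assumed meaning of "128g": one scale per 128 weights
QMAX = 7          # symmetric signed 4-bit range used in this sketch: [-7, 7]


def quantize_group(weights: List[float]) -> Tuple[List[int], float]:
    """Round-to-nearest symmetric INT4 quantization of a single group."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    codes = [max(-QMAX, min(QMAX, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes: List[int], scale: float) -> List[float]:
    """Map 4-bit integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def quantize_row(row: List[float], group_size: int = GROUP_SIZE):
    """Split a weight row into groups and quantize each one independently."""
    return [quantize_group(row[i:i + group_size])
            for i in range(0, len(row), group_size)]


# Toy weight row: two groups of 128 values each.
row = [math.sin(i) * 0.05 for i in range(256)]
groups = quantize_row(row)

restored = [w for codes, scale in groups for w in dequantize_group(codes, scale)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups: {len(groups)}, max reconstruction error: {max_error:.6f}")
```

If each group's scale is stored as a 16-bit value (an assumption about the storage format), the effective cost is 4 + 16/128 = 4.125 bits per weight, which is one reason measured savings fall slightly short of the ideal 75%.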